Selective Term Proximity Scoring Via BP-ANN

Authors

  • Ju Yang
  • Jiancong Tong
  • Rebecca J. Stones
  • Zhaohua Zhang
  • Benjun Ye
  • Gang Wang
  • Xiaoguang Liu
Abstract

When two terms occur together in a document, the probability of a close relationship between them and the document itself is greater if they are in nearby positions. However, ranking functions including term proximity (TP) require larger indexes than traditional document-level indexing, which slows down query processing. Previous studies also show that this technique is not effective for all types of queries. Here we propose a document ranking model which decides for which queries it would be beneficial to use a proximity-based ranking, based on a collection of features of the query. We use a machine learning approach in determining whether utilizing TP will be beneficial. Experiments show that the proposed model returns improved rankings while also reducing the overhead incurred as a result of using TP statistics.
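
The selective idea described above can be sketched as a gate in front of two scoring functions. The following is a minimal illustration only, not the authors' implementation: the one-hidden-layer network stands in for the BP-ANN named in the title, and the feature vector, weights, threshold, and the names `predict_use_tp` and `rank_documents` are all hypothetical. It shows only the forward pass; training by backpropagation is omitted.

```python
import math
from typing import Callable, List, Sequence, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_use_tp(
    features: Sequence[float],
    w_hidden: Sequence[Sequence[float]],
    b_hidden: Sequence[float],
    w_out: Sequence[float],
    b_out: float,
) -> float:
    """Forward pass of a one-hidden-layer network (hypothetical weights).

    Returns a value in (0, 1), read here as the estimated benefit of
    term-proximity (TP) scoring for the query described by `features`.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def rank_documents(
    query_features: Sequence[float],
    doc_ids: Sequence[str],
    score_plain: Callable[[str], float],
    score_with_tp: Callable[[str], float],
    predictor: Callable[[Sequence[float]], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[bool, List[str]]:
    """Use the TP-aware scorer only when the predictor says it should help."""
    use_tp = predictor(query_features) >= threshold
    scorer = score_with_tp if use_tp else score_plain
    return use_tp, sorted(doc_ids, key=scorer, reverse=True)


if __name__ == "__main__":
    # Toy scores standing in for a document-level and a proximity-aware ranker.
    plain = {"d1": 2.0, "d2": 1.0}
    with_tp = {"d1": 1.0, "d2": 3.0}
    print(rank_documents([0.3, 0.7], ["d1", "d2"], plain.get, with_tp.get,
                         lambda f: 0.9))
```

When the predictor falls below the threshold, the proximity-aware scorer is never called, which is where the reduced overhead mentioned in the abstract would come from in a design of this shape.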

Similar resources

A Short Note on Proximity-based Scoring of Documents with Multiple Fields

The BM25 ranking function is one of the most well-known query relevance document scoring functions, and many variations of it have been proposed. The BM25F function is one of its adaptations, designed for modeling documents with multiple fields. The Expanded Span method extends a BM25-like function by taking into consideration the proximity between term occurrences. In this note, we combine these two var...

Efficient Text Proximity Search

In addition to purely occurrence-based relevance models, term proximity has been frequently used to enhance retrieval quality of keyword-oriented retrieval systems. While there have been approaches on effective scoring functions that incorporate proximity, there has not been much work on algorithms or access methods for their efficient evaluation. This paper presents an efficient evaluation fra...

Term Proximity Scoring for Keyword-Based Retrieval Systems

This paper suggests the use of proximity measurement in combination with the Okapi probabilistic model. First, using the Okapi system, our investigation was carried out in a distributed retrieval framework to calculate the same relevance score as that achieved by a single centralized index. Second, by applying a term-proximity scoring heuristic to the top documents returned by a keyword-based s...

Modelling of Conventional and Severe Shot Peening Influence on Properties of High Carbon Steel via Artificial Neural Network

Shot peening (SP), as one of the severe plastic deformation (SPD) methods is employed for surface modification of the engineering components by improving the metallurgical and mechanical properties. Furthermore artificial neural network (ANN) has been widely used in different science and engineering problems for predicting and optimizing in the last decade. In the present study, effects of conv...

Heading-Aware Proximity Measure and Its Application to Web Search

Proximity of query keyword occurrences is one important piece of evidence that is useful for effective query-biased document scoring. If a query keyword occurs close to another in a document, it suggests high relevance of the document to the query. The simplest way to measure proximity between keyword occurrences is to use the distance between them, i.e., the difference of their positions. However, most web pag...

Journal:
  • CoRR

Volume abs/1606.07188

Publication date 2016